Search for: All records

Creators/Authors contains: "Zhou, Mengxi"


  1. Automating the annotation of scanned documents is challenging, requiring a balance between computational efficiency and accuracy. DocParseNet addresses this by combining deep learning and multi-modal learning to process both text and visual data. The model goes beyond traditional OCR and semantic segmentation, capturing the interplay between text and images to preserve contextual nuances in complex document structures. Our evaluations show that DocParseNet significantly outperforms conventional models, achieving mIoU scores of 49.12 on the validation set and 49.78 on the test set, a 58% accuracy improvement over state-of-the-art baseline models and an 18% gain over the UNext baseline. Remarkably, DocParseNet achieves these results with only 2.8 million parameters, roughly a 25-fold reduction in model size and a 5-fold training speedup compared to other models. These metrics, together with a computational cost of 0.039 TFLOPs (batch size 1), underscore DocParseNet's efficiency in document annotation. The model's adaptability and scalability make it well suited to real-world corporate document processing. (A sketch of the mIoU metric appears after this list.)
  2. Adaptive optics-optical coherence tomography (AO-OCT) allows for the three-dimensional visualization of retinal ganglion cells (RGCs) in the living human eye. Quantitative analyses of RGCs have significant potential for improving the diagnosis and monitoring of diseases such as glaucoma. Recent advances in machine learning (ML) have made possible the automatic identification and analysis of RGCs within the complex three-dimensional retinal volumes obtained with such imaging. However, the current state-of-the-art ML approach relies on fully supervised training, which demands large amounts of training labels; each volume requires many hours of expert manual annotation. Here, two semi-supervised training schemes are introduced, (i) cross-consistency training and (ii) cross pseudo supervision, that utilize unlabeled AO-OCT volumes together with a minimal set of labels, vastly reducing the labeling demands. Moreover, these methods outperformed their fully supervised counterpart and achieved accuracy comparable to that of human experts. (A sketch of the cross pseudo supervision loss appears after this list.)
  3. Purpose: This study aims to explore how network visualization provides opportunities for learners to explore data literacy concepts using locally and personally relevant data. Design/methodology/approach: The researchers designed six locally relevant network visualization activities to support students' data reasoning practices toward understanding aggregate patterns in data. Cultural historical activity theory (Engeström, 1999) guides the analysis to identify how network visualization activities mediate students' emerging understanding of aggregate data sets. Findings: Pre/posttest results indicate that the implementation positively impacted students' understanding of network visualization concepts: students were able to identify and interpret key relationships in novel networks. Interaction analysis (Jordan and Henderson, 1995) of video data revealed how the activities mediated students' improved ability to interpret network data. Challenges noted in other studies, such as students' tendency to focus on familiar concepts, also appeared here, and teachers supported conversations that helped students move beyond them. Originality/value: To the best of the authors' knowledge, this is the first study to support elementary students in exploring data literacy through network visualization. The authors discuss how network visualizations and locally and personally meaningful data provide opportunities for learning data literacy concepts across the curriculum.
  4. While there is increased interest in using movement and embodiment to support learning, owing to the rise of theories of embodied cognition and learning, additional work is needed to explore how students collectively develop their understanding within a mixed-reality environment. In this paper, we examine the individual and collective functions of embodied communication as a way of seeing students' learning through embodiment. We analyze data from a mixed-reality (MR) environment, Science through Technology Enhanced Play (STEP) (Danish et al., International Journal of Computer-Supported Collaborative Learning 15:49–87, 2020), using descriptive statistics and interaction analysis to explore the role of gesture and movement in student classroom activities and in pre- and post-interviews. The results reveal that students develop gestures for representing challenging concepts within the classroom and then use these gestures to clarify their understanding in the interview context. We further explore how students collectively develop these gestures in the classroom, with a focus on their communicative acts; provide a list of individual and collective functions supported by student gestures and embodiment within the STEP MR environment; and discuss the functions of each act. Finally, we illustrate the value of attending to these gestures for educators and designers interested in supporting embodied learning.
  5. Adaptive optics imaging has enabled enhanced in vivo visualization of individual cone and rod photoreceptors in the retina. Effective analysis of such high-resolution, feature-rich images requires automated, robust algorithms. This paper describes RC-UPerNet, a novel deep learning algorithm for identifying both types of photoreceptors; it was evaluated on images from the central and peripheral retina extending out to 30° from the fovea in the nasal and temporal directions. Precision, recall, and Dice scores were 0.928, 0.917, and 0.922, respectively, for cones, and 0.876, 0.867, and 0.870 for rods. The scores agree well with those of human graders and surpass previously reported AI-based approaches. (A sketch of the Dice metric appears after this list.)
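
Item 1 reports mean Intersection-over-Union (mIoU) as its headline metric. For reference, here is a minimal Python sketch of how mIoU is conventionally computed for semantic segmentation; the function name and the handling of absent classes are illustrative assumptions, not code from the paper.

    import numpy as np

    def mean_iou(pred, target, num_classes):
        # pred and target are integer class maps of identical shape.
        ious = []
        for c in range(num_classes):
            pred_c = pred == c
            target_c = target == c
            union = np.logical_or(pred_c, target_c).sum()
            if union == 0:
                continue  # class absent from both maps; skip rather than penalize
            inter = np.logical_and(pred_c, target_c).sum()
            ious.append(inter / union)
        return float(np.mean(ious))

An mIoU of 49.12 on this scale means the predicted and ground-truth regions overlap, on average across classes, by about half of their union.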
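Item 2 names cross pseudo supervision as one of its two semi-supervised schemes. In the published formulation of that technique (Chen et al., CVPR 2021), two differently initialized networks train each other: each network's hard pseudo-labels supervise the other on unlabeled data. The PyTorch sketch below illustrates that general idea only; the abstract does not specify how the loss was adapted to 3D AO-OCT volumes.

    import torch.nn.functional as F

    def cps_loss(logits_a, logits_b):
        # Hard pseudo-labels from each network; detach blocks gradients
        # through the pseudo-label branch.
        pseudo_a = logits_a.argmax(dim=1).detach()
        pseudo_b = logits_b.argmax(dim=1).detach()
        # Each network is trained against the other's pseudo-labels.
        return F.cross_entropy(logits_a, pseudo_b) + F.cross_entropy(logits_b, pseudo_a)

The total objective would typically be supervised cross-entropy on the small labeled set plus a weighted cps_loss term on the unlabeled volumes.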
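Item 5 evaluates RC-UPerNet with precision, recall, and Dice scores. For readers unfamiliar with the last of these, the sketch below computes the Dice coefficient, 2|A∩B| / (|A| + |B|), for a pair of binary masks; the empty-mask convention is an assumption, not taken from the paper.

    import numpy as np

    def dice_score(pred_mask, true_mask):
        pred_mask = np.asarray(pred_mask, dtype=bool)
        true_mask = np.asarray(true_mask, dtype=bool)
        denom = pred_mask.sum() + true_mask.sum()
        if denom == 0:
            return 1.0  # both masks empty: treat as perfect agreement
        return 2.0 * np.logical_and(pred_mask, true_mask).sum() / denom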